Calculate a Covariance Matrix
Remember how we defined the covariance matrix:
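$$\mathbf{P} = \begin{bmatrix} \mathrm{Cov}(r_A, r_A) & \mathrm{Cov}(r_A, r_B) \\ \mathrm{Cov}(r_B, r_A) & \mathrm{Cov}(r_B, r_B) \end{bmatrix}.$$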
And covariance is
$$\mathrm{Cov}(r_A, r_B) = \mathrm{E}\left[(r_A - \bar{r}_A)(r_B - \bar{r}_B)\right].$$
If $r_A$ and $r_B$ are discrete vectors of values, that is, they can take on the values $(r_{Ai}, r_{Bi})$ for $i = 1, \ldots, n$, with equal probabilities $1/n$, then the covariance can be estimated from the sample as
$$\mathrm{Cov}(r_A, r_B) = \frac{1}{n-1} \sum_{i=1}^{n} (r_{Ai} - \bar{r}_A)(r_{Bi} - \bar{r}_B).$$
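To make the sum concrete, here is a minimal sketch in plain Python. The function name `sample_covariance` and the return values are made up for illustration; they are not part of the lesson.

```python
def sample_covariance(r_a, r_b):
    """Sample covariance of two equal-length sequences, using n - 1."""
    n = len(r_a)
    if n != len(r_b):
        raise ValueError("r_a and r_b must have the same length")
    if n < 2:
        raise ValueError("need at least two observations")
    mean_a = sum(r_a) / n
    mean_b = sum(r_b) / n
    # Sum the products of deviations from each mean, then divide by n - 1.
    total = sum((a - mean_a) * (b - mean_b) for a, b in zip(r_a, r_b))
    return total / (n - 1)


# Hypothetical returns for two assets (illustrative numbers only).
r_a = [0.01, 0.03, -0.02, 0.02]
r_b = [0.02, 0.01, -0.01, 0.04]
cov_ab = sample_covariance(r_a, r_b)
print(cov_ab)
```

With these numbers the deviation products sum to 0.0009, so the result is approximately 0.0003 (up to floating-point rounding).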
We use $n-1$ in the denominator for the same reason that we use $n-1$ in the sample standard deviation: we have a sample, and we want to calculate an unbiased estimate of the population covariance.
But if $\bar{r}_A = \bar{r}_B = 0$, then the covariance equals
$$\mathrm{Cov}(r_A, r_B) = \frac{1}{n-1} \sum_{i=1}^{n} r_{Ai} r_{Bi}.$$
In matrix notation, this equals
$$\frac{1}{n-1} r_A^T r_B.$$
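As a quick numerical check of the inner-product form, the sketch below uses NumPy (an assumption; the lesson does not name a library) and made-up return values. The vectors are demeaned first so that the mean-0 condition holds.

```python
import numpy as np

# Hypothetical return series (illustrative numbers only).
r_a = np.array([0.01, 0.03, -0.02, 0.02])
r_b = np.array([0.02, 0.01, -0.01, 0.04])

# Subtract each mean so that both vectors have mean 0.
r_a0 = r_a - r_a.mean()
r_b0 = r_b - r_b.mean()

n = len(r_a0)

# Inner product form: (1 / (n - 1)) * r_A^T r_B
cov_dot = r_a0 @ r_b0 / (n - 1)

# Explicit sum over i, for comparison.
cov_sum = sum(a * b for a, b in zip(r_a0, r_b0)) / (n - 1)

print(cov_dot, cov_sum)
```

Both expressions should agree up to floating-point rounding.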
Therefore, if $\mathbf{r}$ is a matrix that contains the vectors $r_A$ and $r_B$ as its columns,
$$\mathbf{r} = \begin{bmatrix} \vdots & \vdots \\ r_A & r_B \\ \vdots & \vdots \end{bmatrix},$$
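then
$$\mathbf{r}^T \mathbf{r} = \begin{bmatrix} \cdots & r_A & \cdots \\ \cdots & r_B & \cdots \end{bmatrix} \begin{bmatrix} \vdots & \vdots \\ r_A & r_B \\ \vdots & \vdots \end{bmatrix} = \begin{bmatrix} r_A^T r_A & r_A^T r_B \\ r_B^T r_A & r_B^T r_B \end{bmatrix}.$$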
So if each vector of observations in your data matrix has mean 0, you can calculate the covariance matrix as:
$$\frac{1}{n-1} \mathbf{r}^T \mathbf{r}$$
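A minimal NumPy sketch of this calculation follows. The data matrix is hypothetical, and NumPy is an assumption rather than something the lesson prescribes. Because the formula requires every column to have mean 0, the sketch subtracts the column means first, then compares the result with `np.cov`.

```python
import numpy as np

# Hypothetical data matrix: each row is one observation,
# the columns are r_A and r_B (illustrative numbers only).
r = np.array([
    [0.01, 0.02],
    [0.03, 0.01],
    [-0.02, -0.01],
    [0.02, 0.04],
])

# Demean each column so the mean-0 assumption holds.
r0 = r - r.mean(axis=0)

n = r0.shape[0]
cov_matrix = r0.T @ r0 / (n - 1)

# np.cov treats rows as variables by default, so rowvar=False is needed
# when variables are in columns. It divides by n - 1 by default.
cov_check = np.cov(r, rowvar=False)

print(cov_matrix)
print(np.allclose(cov_matrix, cov_check))
```

Skipping the demeaning step on data whose columns do not have mean 0 would give a different matrix, so it is worth checking the column means before applying the shortcut.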